These exercises cover the scales, statistics and themes of ggplot2 for plotting in R.
Exercise 1 - Themes
Read in the cleaned patients dataset that we saw earlier in the ggplot2 course (“patients_clean_ggplot2.txt”).
Set the global theme to theme_bw(). Using the patients dataset, generate a scatter plot of BMI versus Weight. Add a color scale to the scatter plot based on the Pet variable. Use an additional geom to add a fit line as an extra layer on the scatter plot (use the lm method). Let's also add a nice color palette of your choice from ColorBrewer, paletteer or viridis.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
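One way to approach the steps in this exercise is sketched here. It is only an illustration: the `patients` data frame is a hypothetical stand-in with made-up values (the real exercise reads "patients_clean_ggplot2.txt"), BMI is assumed to go on the y axis and Weight on the x axis, and the "Set2" ColorBrewer palette is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical stand-in for the patients table; values are made up.
# The real exercise reads "patients_clean_ggplot2.txt" instead.
set.seed(1)
patients <- data.frame(
  Weight = round(runif(30, min = 55, max = 100), 1),
  Pet    = sample(c("Cat", "Dog", "None"), 30, replace = TRUE)
)
patients$BMI <- round(patients$Weight / 1.75^2 + rnorm(30, sd = 1.5), 1)

# Set the global theme so that every later plot picks it up
theme_set(theme_bw())

p <- ggplot(patients, aes(x = Weight, y = BMI)) +
  geom_point(aes(colour = Pet)) +                 # colour only the points by Pet
  geom_smooth(method = "lm", formula = y ~ x) +   # one overall linear fit line
  scale_colour_brewer(palette = "Set2")           # qualitative ColorBrewer palette

p
```

Because the colour mapping sits inside `geom_point()`, the fit line is drawn once for all patients; moving it into `ggplot()` would give one line per Pet group. For a viridis palette, `scale_colour_viridis_d()` could be swapped in for `scale_colour_brewer()`.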
Take at least one element from each theme you just tried out and add it to our existing theme.
Use the + operator to update our plot with the new theme.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Saving 7 x 5 in image
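A minimal sketch of combining theme elements and adding the result to a plot with `+` might look like this. The built-in `mtcars` data stands in for the exercise plot, and the particular elements chosen (legend position, minor grid lines, bold title) are assumptions picked for illustration rather than the ones used in the exercise.

```r
library(ggplot2)

# Built-in data used only as a stand-in for the exercise plot
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# Start from theme_bw() and override individual elements
my_theme <- theme_bw() +
  theme(
    legend.position  = "bottom",
    panel.grid.minor = element_blank(),
    plot.title       = element_text(face = "bold")
  )

# Adding the theme with + changes this plot only, not the global theme
themed_plot <- base_plot + my_theme + ggtitle("Custom theme")

# Save to a temporary file; width and height are given in inches
out_file <- file.path(tempdir(), "themed_plot.pdf")
ggsave(out_file, themed_plot, width = 7, height = 5)
```

When `width` and `height` are left out, `ggsave()` falls back to the size of the current graphics device and reports it in a "Saving ... in image" message.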
Exercise 2 - External Packages
## Warning: namespace 'DESeq2' is not available and has been replaced
## by .GlobalEnv when processing object 'deseq_pca_example'
##                PC1         PC2 group condition     name
## sample1  -13.24423  22.0512418     A      Ctrl  sample1
## sample2  -14.40362 -13.8514994     A      Ctrl  sample2
## sample3  -14.79663  -2.9299797     A      Ctrl  sample3
## sample4  -15.12221  -9.4008058     A       Mut  sample4
## sample5  -14.78719  -0.2914010     A       Mut  sample5
## sample6  -13.77777   5.6896212     A       Mut  sample6
## sample7   13.68499  -2.4219808     B      Ctrl  sample7
## sample8   14.85294   3.9328670     B      Ctrl  sample8
## sample9   12.21286  -3.0234083     B      Ctrl  sample9
## sample10  16.25675  -5.6924350     B       Mut sample10
## sample11  14.85222   0.9620356     B       Mut sample11
## sample12  14.27188   4.9757444     B       Mut sample12
## Aesthetic mapping:
## * `colour` -> `group`
## * `x` -> `PC1`
## * `y` -> `PC2`
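The output above lists a mapping of colour to group, x to PC1 and y to PC2. A minimal sketch of a plot with that mapping is given here, assuming a plain ggplot2 scatter plot; the four rows are copied from the printed table, and the names `pca_df` and `pca_plot` are hypothetical.

```r
library(ggplot2)

# Four rows copied from the printed PCA table (two per group)
pca_df <- data.frame(
  PC1       = c(-13.24423, -15.12221, 13.68499, 16.25675),
  PC2       = c(22.0512418, -9.4008058, -2.4219808, -5.6924350),
  group     = c("A", "A", "B", "B"),
  condition = c("Ctrl", "Mut", "Ctrl", "Mut"),
  name      = c("sample1", "sample4", "sample7", "sample10")
)

# Map the two components to the axes and the group to colour
pca_plot <- ggplot(pca_df, aes(x = PC1, y = PC2, colour = group)) +
  geom_point(size = 3)

# Inspect which variables are mapped to which aesthetics
pca_plot$mapping
```

Inspecting the `mapping` element of a plot object is a quick way to see what a plot returned by another package is built on before adding further layers or scales to it.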
Exercise 3 - Interactive Plots
## Warning in geom_point(aes(label = name)): Ignoring unknown aesthetics: label
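The warning above comes from passing a `label` aesthetic to `geom_point()`, which ggplot2 itself does not use. A minimal sketch of how such a plot can be made interactive, assuming the plotly package is installed and reusing four rows from the PCA table in Exercise 2 (the object names are hypothetical):

```r
library(ggplot2)
library(plotly)

# Four rows copied from the PCA table shown in Exercise 2
pca_df <- data.frame(
  PC1   = c(-13.24423, -15.12221, 13.68499, 16.25675),
  PC2   = c(22.0512418, -9.4008058, -2.4219808, -5.6924350),
  group = c("A", "A", "B", "B"),
  name  = c("sample1", "sample4", "sample7", "sample10")
)

# geom_point() ignores `label` (hence the warning), but the mapping is kept
# in the plot object so that plotly can use it for the hover text
p <- ggplot(pca_df, aes(x = PC1, y = PC2, colour = group)) +
  geom_point(aes(label = name))

# Restrict the hover text to the sample name and the two coordinates
interactive_plot <- ggplotly(p, tooltip = c("label", "x", "y"))
interactive_plot
```

The warning is harmless in this context: the static plot is unchanged, and the extra aesthetic only matters once the plot is converted with `ggplotly()`.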
Exercise 4 - Working Example
In this final exercise we will run through a common example: making a volcano plot.
A volcano plot consists of:
1) log2FC on the x axis
2) -log10(pval) on the y axis
It is also good to add some additional customization:
- Highlight significant genes i.e. pval < 0.05
- Highlight genes above a certain threshold log2FC i.e. > 1
- Add lines to denote these thresholds
- Label some genes of interest directly: “Gm8714”, “Pas1b”, “Rab39”, “Tmc2”, “Ttpal”, “Ctdsp1”
- Use a simple theme to also give the plot a simple look
- Export the plot as a pdf
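The two thresholds translate into a simple classification of each gene, which can be sketched in base R before any plotting. The `toy_res` table is hypothetical: the column names follow the DESeq2 convention (`log2FoldChange`, `pvalue`), while the gene names and values are made up for illustration.

```r
# Hypothetical toy results table; values are made up for illustration
toy_res <- data.frame(
  SYMBOL         = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(2.3, -1.8, 0.4, 0.1),
  pvalue         = c(0.001, 0.01, 0.03, 0.6)
)

# y-axis value of the volcano plot
toy_res$neg_log10_p <- -log10(toy_res$pvalue)

# Classify by p-value first, then by the direction and size of the fold change
toy_res$status <- with(
  toy_res,
  ifelse(pvalue >= 0.05, "NS",
  ifelse(log2FoldChange > 1, "SigUp",
  ifelse(log2FoldChange < -1, "SigDown", "Sig")))
)

toy_res
```

Here geneA is classed as "SigUp", geneB as "SigDown", geneC as "Sig" (significant but with a small fold change) and geneD as "NS". Storing the class in its own column keeps the later `aes()` call short and makes the categories easy to check with `table(toy_res$status)`.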
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_text()`).
## Saving 7 x 5 in image
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_text_repel()`).
library(ggplot2)
library(plotly)

# my_res is assumed to be the DESeq results table, with the columns
# log2FoldChange, pvalue and SYMBOL
myplot <- ggplot(my_res, aes(x = log2FoldChange,
                             y = -log10(pvalue),
                             # classify each gene by p-value, then by fold change
                             color = ifelse(pvalue > 0.05, "NS",
                                     ifelse(log2FoldChange > 1, "SigUp",
                                     ifelse(log2FoldChange < (-1), "SigDown", "Sig"))))) +
  geom_point(size = 0.5, alpha = 0.5) +
  scale_color_manual(name = "Significance",
                     breaks = c("NS", "Sig", "SigUp", "SigDown"),
                     values = c("black", "green", "blue", "red")) +
  theme_bw() +
  ggtitle("Volcano Plot showing significance of \ngene expression changes following DESeq analysis") +
  # dotted guide lines at the p-value and fold-change thresholds
  geom_hline(yintercept = -log10(0.05), lty = 3, color = "gray") +
  geom_vline(xintercept = c(-1, 1), lty = 3, color = "gray") +
  theme(text = element_text(size = 8))

# the extra point layer carries SYMBOL as hover text for plotly
ggplotly(myplot + geom_point(aes(text = SYMBOL)), source = "select", tooltip = c("SYMBOL"))
## Warning in geom_point(aes(text = SYMBOL)): Ignoring unknown aesthetics: text
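The labelling and export steps of the exercise can be sketched separately. This assumes the ggrepel package is installed; the `toy_res` table is hypothetical, with only the six gene names taken from the exercise text and all values randomly generated.

```r
library(ggplot2)
library(ggrepel)

genes_of_interest <- c("Gm8714", "Pas1b", "Rab39", "Tmc2", "Ttpal", "Ctdsp1")

# Hypothetical values; only the gene names come from the exercise text
set.seed(42)
toy_res <- data.frame(
  SYMBOL         = c(genes_of_interest, paste0("gene", 1:44)),
  log2FoldChange = rnorm(50, sd = 1.5),
  pvalue         = runif(50, min = 1e-6, max = 1)
)

volcano <- ggplot(toy_res, aes(x = log2FoldChange, y = -log10(pvalue))) +
  geom_point(size = 0.5, alpha = 0.5) +
  # label only the genes of interest; ggrepel nudges labels apart
  geom_text_repel(
    data = subset(toy_res, SYMBOL %in% genes_of_interest),
    aes(label = SYMBOL),
    size = 3
  ) +
  theme_bw()

# Export as a PDF; the format is taken from the file extension
out_file <- file.path(tempdir(), "volcano.pdf")
ggsave(out_file, volcano, width = 7, height = 5)
```

Passing a subset of the table to the `data` argument of the label layer keeps the other points unlabelled while still placing each label next to its own point.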